Abstract. The Shahshahani geometry of evolutionary game theory is
realized as the information geometry of the simplex, deriving from the
Fisher information metric of the manifold of categorical probability
distributions. Some essential concepts in evolutionary game theory are
realized information-theoretically. Results are extended to the
Lotka-Volterra equation and to multiple population systems.
1. Introduction

The replicator equation is a widely-used model of natural selection.
This paper explains the realization of the geometry of evolutionary
game theory in terms of information theoretical principles, giving a
purely mathematical and statistical origin for the replicator equation.
Under this interpretation, the replicator equation models the
information dynamics of a population of replicating entities.
Additionally, the Kullback-Leibler information divergence, which serves
as a Lyapunov function for the replicator dynamic, can be interpreted as
a measure of potential information, characterizing the concept of an
evolutionarily stable state informatically.




1.1. Continuous Replicator Dynamic. Consider a categorical distribution
$X$ on $n$ categories of entities in a population. This is a discrete
probability distribution represented by a unit vector of $n$ variables
$x = (x_1, \ldots, x_n)$ under the normalization
$|x| = x_1 + \cdots + x_n = 1$, where $x_i$ denotes the proportion of
the $i$-th type in the population. The replicator equation on this
distribution is the differential equation

\[ \dot{x}_i = x_i \left( f_i(x) - \bar{f}(x) \right), \]

where $f(x) = (f_1(x), \ldots, f_n(x))$ is a fitness landscape and
$\bar{f}(x) = x_1 f_1(x) + \cdots + x_n f_n(x)$ is the mean fitness.
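As a minimal numerical sketch of the replicator equation, the following Python code advances a distribution by forward-Euler steps. The step size, the number of steps, and the constant fitness landscape `constant_fitness` are illustrative assumptions, not taken from the paper.

```python
from typing import Callable, List

Vector = List[float]


def mean_fitness(x: Vector, f: Vector) -> float:
    """Mean fitness: the sum of x_i * f_i."""
    return sum(xi * fi for xi, fi in zip(x, f))


def replicator_step(x: Vector, fitness: Callable[[Vector], Vector], dt: float) -> Vector:
    """One forward-Euler step of dx_i/dt = x_i * (f_i(x) - mean fitness)."""
    f = fitness(x)
    fbar = mean_fitness(x, f)
    return [xi + dt * xi * (fi - fbar) for xi, fi in zip(x, f)]


def integrate(x0: Vector, fitness: Callable[[Vector], Vector],
              dt: float = 0.01, steps: int = 1000) -> Vector:
    """Apply `steps` Euler steps starting from x0."""
    x = list(x0)
    for _ in range(steps):
        x = replicator_step(x, fitness, dt)
    return x


def constant_fitness(x: Vector) -> Vector:
    """Hypothetical landscape: fixed fitness values, independent of x."""
    return [1.0, 2.0, 3.0]


x_final = integrate([1 / 3, 1 / 3, 1 / 3], constant_fitness)
print(x_final, sum(x_final))
```

With these assumed fitness values the third type has the highest fitness, so its proportion should grow toward one while the total stays normalized.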

2. Geometric Aspects of Evolutionary Game Theory 

The information theoretic interpretations of the previous chapter 
have a unified basis in information geometry. We begin with a descrip- 
tion of the geometry of the simplex and geometric results known in 
evolutionary game theory. 

2.1. The Geometry of the Simplex. Let $S_n$ be the interior of the
$n$-simplex $\Delta^n$, which is $(n-1)$-dimensional. Each point $x$ of
the simplex has the property that $x_1 + \cdots + x_n = 1$, so the
tangent space at any point of the interior is the $(n-1)$-dimensional
vector space of vectors $v = (v_1, \ldots, v_n)$ such that
$v_1 + \cdots + v_n = 0$. The orthogonal complement of the tangent space
is the one-dimensional line with direction vector
$\mathbf{1} = (1, 1, \ldots, 1)$. Indeed $\mathbf{1} \cdot v = 0$ for
any $v$ in the tangent space and the complement is necessarily
one-dimensional. Each face of $\Delta^n$ is isomorphic to a simplex of
one lower dimension, which can be seen by setting one of the $x_i$ to
zero; this indicates the absence of that type in the population. The
replicator equation is forward-invariant on the simplex (and hence each
of its faces), since if $x_i = 0$ then $\dot{x}_i = 0$. Because of this
property, the replicator equation is called non-innovative since new
types cannot arise, in contrast to evolutionary dynamics in which this
is possible (notably the replicator-mutator equation [10] and the
orthogonal projection dynamic [12]).

2.2. Shahshahani Geometry. Shahshahani introduced two Riemannian
manifolds into mathematical biology [13]: the positive orthant of
$\mathbb{R}^n$, denoted $\mathbb{R}^n_+$, with the metric

\[ g_{ij}(x) = \frac{|x|}{x_i}\,\delta_{ij}, \]

where $|x| = \sum_i x_i$, and the restriction to the simplex
$\Delta^n = \{x \in \mathbb{R}^n \mid |x| = 1,\ x_i > 0\ \forall i\} \subset \mathbb{R}^n_+$,
with the metric

\[ g_{ij}(x) = \frac{1}{x_i}\,\delta_{ij}. \]

Call the latter manifold the Shahshahani manifold; its metric is known
as the Shahshahani metric. There is a normalization map
$N : \mathbb{R}^n_+ \to \Delta^n$ given by $x \mapsto x/|x|$. For each
$\tau \in \mathbb{R}_+$, there is a map $\varphi_\tau$ mapping the
simplex into $\mathbb{R}^n_+$ by $x \mapsto \tau x$. These maps are
sections of the normalization map since
$N \circ \varphi_\tau = \mathrm{id}_{\Delta^n}$. The Shahshahani metric
diverges on the boundary of the simplex, so the metric is valid only on
the interior $S_n$. Dynamics that are forward-invariant, such as the
replicator dynamic, are not affected by the discontinuity at the
boundary.

2.3. The Replicator Dynamic, Geometrically. The geometry of the
Shahshahani manifold yields an elegant interpretation of the replicator
equation: it is a gradient flow with respect to the Shahshahani metric.
Shahshahani proved the result for a special case of the replicator
equation; the following more general theorem comes from [8].

Theorem 1. If the differential equation $\dot{x}_i = f_i(x)$ is a
Euclidean gradient with $f_i = \frac{\partial V}{\partial x_i}$, then
the replicator equation
$\dot{x}_i = \hat{f}_i(x) = x_i(f_i(x) - \bar{f}(x))$ is a gradient with
respect to the Shahshahani metric.

In the case that the fitness landscape is a Euclidean gradient, the
potential $V$ of the Shahshahani gradient gives a Lyapunov function for
the dynamic. The classical case is that of a symmetric matrix $A$ and
fitness landscape $f(x) = Ax$, where $A = (a_{ij})$ is the matrix of
Malthusian fitness parameters given by the difference in birth rates and
death rates $a_{ij} = b_{ij} - d_{ij}$ of an individual having alleles
$i$ and $j$, where the alleles are of a single gene locus. In this case
the Shahshahani potential is half the mean fitness,
$\frac{1}{2} x \cdot f(x) = \frac{1}{2} x \cdot Ax$, with $Ax$ the
Euclidean gradient [8, 13].

2.4. Fisher's Fundamental Theorem and Kimura's Maximal 
Principle. Fisher's fundamental theorem is a consequence of the geo- 
metric approach. 

Theorem 2. The rate of change of the Shahshahani potential is equal
to the variance of the fitness landscape [8]:

\[ \dot{V}(x) = \mathrm{Var}_x[f(x)]. \]

This is a general version of Fisher's Fundamental Theorem of Nat- 
ural Selection, specializing to the traditional result in the case of a 
symmetric and linear fitness landscape [8]. Kimura's maximal prin- 
ciple follows from the fact that the replicator equation is a gradient 
flow [13]. As these are both important results in mathematical biol- 
ogy emerging from the geometry, an interpretation is desired of the 
Shahshahani metric that provides intuition for the introduced geome- 
try on the simplex in the context of modeling natural selection. 
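The statement of Theorem 2 can be checked numerically in the classical symmetric case $f(x) = Ax$ with potential $V(x) = \frac{1}{2} x \cdot Ax$. The sketch below compares a finite-difference estimate of $\dot{V}$ along one small replicator step with the variance of the fitness landscape. The matrix `A_example`, the state `x_example`, and the step size are hypothetical values chosen only for illustration.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def mat_vec(A: Matrix, x: Vector) -> Vector:
    """Matrix-vector product A x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u: Vector, v: Vector) -> float:
    return sum(ui * vi for ui, vi in zip(u, v))


def potential(A: Matrix, x: Vector) -> float:
    """Shahshahani potential V(x) = (1/2) x . A x for symmetric A."""
    return 0.5 * dot(x, mat_vec(A, x))


def fitness_variance(A: Matrix, x: Vector) -> float:
    """Variance of f(x) = A x under the distribution x."""
    f = mat_vec(A, x)
    fbar = dot(x, f)
    return sum(xi * (fi - fbar) ** 2 for xi, fi in zip(x, f))


def potential_rate(A: Matrix, x: Vector, dt: float = 1e-6) -> float:
    """Finite-difference estimate of dV/dt along one replicator Euler step."""
    f = mat_vec(A, x)
    fbar = dot(x, f)
    x_next = [xi + dt * xi * (fi - fbar) for xi, fi in zip(x, f)]
    return (potential(A, x_next) - potential(A, x)) / dt


# Hypothetical symmetric fitness matrix and interior state.
A_example = [[1.0, 2.0, 0.0], [2.0, 0.0, 3.0], [0.0, 3.0, 1.0]]
x_example = [0.2, 0.5, 0.3]

print(potential_rate(A_example, x_example), fitness_variance(A_example, x_example))
```

The two printed numbers should agree up to the discretization error of the finite difference.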

3. The Information Geometry of Natural Selection 

An intuitive interpretation of the Shahshahani geometry is provided
by information theory. Information geometry [5] studies manifolds of
probability distributions $p(s, x)$ on a set $S$ depending on parameters
$x$, which are the coordinates of the manifold. The manifold is endowed
with the Fisher information metric,

\[ g_{ij}(x) = E\left[ \frac{\partial \log p}{\partial x_i} \frac{\partial \log p}{\partial x_j} \right], \]

which can be shown to be the unique (up to a constant) metric respecting
sufficient statistics [7].



3.1. The Fisher Information Metric is the Shahshahani Metric.

The manifold of immediate interest is $P(X)$, the set of categorical
probability distributions on a finite set $X$, with the Fisher
information metric. In this case, it is convenient to abuse notation by
allowing the parameters $x$ and distribution variables $s$ to have the
same symbolic representation. There is a natural mapping
$\varphi : P(X) \to \Delta^n$, where $|X| = n$, given by
$p \mapsto (p(1), p(2), \ldots, p(n)) = (x_1, \ldots, x_n)$. This maps
$P(X)$ isometrically onto the simplex, and an easy computation shows
that the Fisher information metric is induced by the Shahshahani metric
under this mapping. Simply observe that

\[ g_{ij}(x) = E\left[ \frac{\partial \log x}{\partial x_i} \frac{\partial \log x}{\partial x_j} \right] = \sum_k x_k \frac{1}{x_i}\delta_{ik} \frac{1}{x_j}\delta_{jk} = \frac{1}{x_i}\delta_{ij}. \]

This result was recognized in [4] and [6].
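The computation above can be mirrored directly in code. The following sketch evaluates the expectation defining the Fisher information matrix for a categorical distribution, treating the probabilities as independent coordinates as in the text, and compares it with the diagonal Shahshahani metric. The function names and the sample point are illustrative assumptions.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def score(i: int, k: int, x: Vector) -> float:
    """Partial derivative of log p(k) = log x_k with respect to x_i."""
    return 1.0 / x[k] if i == k else 0.0


def fisher_information(x: Vector) -> Matrix:
    """g_ij = sum over outcomes k of x_k * d_i log x_k * d_j log x_k."""
    n = len(x)
    return [[sum(x[k] * score(i, k, x) * score(j, k, x) for k in range(n))
             for j in range(n)]
            for i in range(n)]


def shahshahani_metric(x: Vector) -> Matrix:
    """Diagonal metric with entries 1 / x_i."""
    n = len(x)
    return [[1.0 / x[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


x_point = [0.2, 0.3, 0.5]  # hypothetical interior point of the simplex
print(fisher_information(x_point))
print(shahshahani_metric(x_point))
```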



3.2. Fisher's Fundamental Theorem. Fisher's fundamental theorem is
built into the geometry of $P(X)$ [5]. Define the maps $E[g]$ on $P(X)$
by $p \mapsto E_p[g]$, where $g$ is from the set of functions
$\mathbb{R}^X = \{g : X \to \mathbb{R}\}$ and $E_p$ is the mean taken at
the distribution $p$. Similarly, let $V_p[g]$ denote the variance of the
function $g$ at $p$.

Theorem 3. For any $g \in \mathbb{R}^X$,

\[ V_p[g] = \|(dE[g])_p\|_p^2 = \|(\nabla E[g])_p\|_p^2, \]

where the norm is induced by the Fisher information metric, for all
$p \in P(X)$.



Footnote (to Section 3.1). Different coordinates are sometimes chosen in
information geometry for $P(X)$, letting (for instance)
$x_{n+1} = 1 - \sum_{i=1}^{n} x_i$ to enforce $\sum_i x_i = 1$. This
yields an asymmetric set of replicator equations, so a different set of
coordinates is chosen in this exposition.
3.3. Information Divergences and Metrics on P(X). Some Riemannian
metrics on $P(X)$ can be derived from information divergences [5].
Information geometry defines an information divergence as a smooth
function $D(\cdot\|\cdot) : P(X) \times P(X) \to \mathbb{R}$ such that
$D(x\|y) \geq 0$ with equality iff $x = y$. The second order Taylor
expansion in either variable evaluated along the diagonal $x = y$ begins
with the Hessian term $H$. Indeed,

\[ \begin{aligned}
D(x\|y) &= D(x\|y)|_{x=y} + (\nabla D(x\|y)|_{x=y}) \cdot (x - y) + \tfrac{1}{2} (x - y)^T H(x) \cdot (x - y) + \cdots \\
&= 0 + 0 + \tfrac{1}{2} (x - y)^T H(x) \cdot (x - y) + \cdots
\end{aligned} \]

because the gradient is parallel to $\mathbf{1}$ and
$\mathbf{1} \cdot (x - y) = \mathbf{1} \cdot x - \mathbf{1} \cdot y = 1 - 1 = 0$.

In the case that the Hessian is positive definite, it can be used to
define a metric,

\[ g^{(D)}_{ij}(x) = \left. \frac{\partial^2 D(x\|y)}{\partial x_i\,\partial x_j} \right|_{y = x}. \]
A metric then defines a gradient flow, hence a global information
divergence yields a dynamical system on the simplex. Importantly, the
Hessian of the Kullback-Leibler divergence (in either variable,
evaluated on the diagonal) is the Fisher information matrix, yielding
the local to global connection of these two measures of information.

Example 4 (Kullback-Leibler Divergence). The Kullback-Leibler
divergence $D_{KL}(x\|y) = \sum_k x_k \log x_k - \sum_k x_k \log y_k$
localizes to the Fisher information metric. In coordinates we obtain the
Shahshahani metric since

\[ \left. \frac{\partial^2 D_{KL}(x\|y)}{\partial x_i\,\partial x_j} \right|_{y = x} = \frac{1}{x_i}\delta_{ij}. \]

Hence the induced gradient flow is the replicator equation. This allows
the interpretation of Fisher's Fundamental theorem and Kimura's maximal
principle in terms of Fisher information: natural selection forms a
gradient with respect to an informatic measure, and hence locally has
the direction of maximal information increase. The rate of change of
the mean fitness of the population is given by the informatic variance.
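The localization in Example 4 can be illustrated with a finite-difference Hessian. The sketch below differentiates the Kullback-Leibler divergence twice in its first argument, evaluates on the diagonal, and compares with $\frac{1}{x_i}\delta_{ij}$. The probabilities are treated as independent coordinates, and the step size `h` and the sample point are assumptions made for the illustration.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def kl_divergence(x: Vector, y: Vector) -> float:
    """D(x || y) = sum_k x_k * log(x_k / y_k)."""
    return sum(xk * math.log(xk / yk) for xk, yk in zip(x, y))


def kl_hessian_on_diagonal(x: Vector, h: float = 1e-4) -> Matrix:
    """Central-difference Hessian of z -> D(z || x), evaluated at z = x."""
    n = len(x)

    def shifted(i: int, si: int, j: int, sj: int) -> float:
        # Perturb coordinates i and j of the first argument by +-h.
        z = list(x)
        z[i] += si * h
        z[j] += sj * h
        return kl_divergence(z, x)

    return [[(shifted(i, 1, j, 1) - shifted(i, 1, j, -1)
              - shifted(i, -1, j, 1) + shifted(i, -1, j, -1)) / (4 * h * h)
             for j in range(n)]
            for i in range(n)]


x_diag = [0.2, 0.3, 0.5]  # hypothetical interior point
hessian = kl_hessian_on_diagonal(x_diag)
print(hessian)
```

The diagonal entries should approximate $1/x_i$ and the off-diagonal entries should be close to zero.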

3.4. Kullback-Leibler Divergence is a Lyapunov Function for the
Replicator Dynamic. The following theorem shows that the
Kullback-Leibler information divergence forms a Lyapunov function for
the replicator dynamic, given an evolutionarily stable state. In fact,
evolutionary stability is characterized by this property. A version of
this theorem was proved in [1] and in [3].
Theorem 5. The state $\hat{x}$ is an interior ESS for the replicator
dynamic if and only if $D_{KL}(\hat{x}\|x)$ is a local Lyapunov
function.

Proof. Let $V(x) = D_{KL}(\hat{x}\|x) = \sum_i \hat{x}_i \log \hat{x}_i - \sum_i \hat{x}_i \log x_i$.
Then we have that

\[ \begin{aligned}
\dot{V}(x) &= -\sum_i \hat{x}_i \frac{\dot{x}_i}{x_i} = -\sum_i \hat{x}_i (f_i(x) - \bar{f}(x)) \\
&= -\sum_i \hat{x}_i f_i(x) + \sum_i \hat{x}_i \bar{f}(x) = -\sum_i \hat{x}_i f_i(x) + \Big( \sum_i \hat{x}_i \Big) \bar{f}(x) \\
&= -\sum_i \hat{x}_i f_i(x) + \bar{f}(x) = -(\hat{x} \cdot f(x) - x \cdot f(x)) < 0.
\end{aligned} \]

The last inequality holds if and only if $\hat{x}$ is an ESS. Finally,
by Jensen's inequality, $D_{KL}$ is minimized when $x_i = \hat{x}_i$, so
it is a local Lyapunov function. □
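Theorem 5 can be illustrated on a small example. The sketch below uses a Hawk-Dove payoff matrix, which is an assumption made here and does not appear in the paper; for these payoffs the interior ESS is $\hat{x} = (1/2, 1/2)$. The code records $D_{KL}(\hat{x}\|x)$ along a forward-Euler trajectory of the replicator equation so that its decrease can be inspected.

```python
import math
from typing import Callable, List

Vector = List[float]


def kl_divergence(x_hat: Vector, x: Vector) -> float:
    """D(x_hat || x) = sum_i x_hat_i * log(x_hat_i / x_i)."""
    return sum(a * math.log(a / b) for a, b in zip(x_hat, x))


def hawk_dove_fitness(x: Vector) -> Vector:
    """Hypothetical Hawk-Dove landscape f(x) = A x with A = [[-1, 2], [0, 1]]."""
    hawks, doves = x
    return [-1.0 * hawks + 2.0 * doves, 0.0 * hawks + 1.0 * doves]


def replicator_step(x: Vector, fitness: Callable[[Vector], Vector], dt: float) -> Vector:
    """One forward-Euler step of the replicator equation."""
    f = fitness(x)
    fbar = sum(xi * fi for xi, fi in zip(x, f))
    return [xi + dt * xi * (fi - fbar) for xi, fi in zip(x, f)]


def divergence_along_trajectory(x0: Vector, x_hat: Vector,
                                fitness: Callable[[Vector], Vector],
                                dt: float = 0.01, steps: int = 2000) -> List[float]:
    """Values of D(x_hat || x(t)) at each Euler step, starting from x0."""
    x = list(x0)
    values = [kl_divergence(x_hat, x)]
    for _ in range(steps):
        x = replicator_step(x, fitness, dt)
        values.append(kl_divergence(x_hat, x))
    return values


ess = [0.5, 0.5]
kl_values = divergence_along_trajectory([0.9, 0.1], ess, hawk_dove_fitness)
print(kl_values[0], kl_values[-1])
```

The recorded values should decrease along the trajectory, consistent with the divergence acting as a Lyapunov function near the ESS.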

A similar result is proven in [8], with the Lyapunov function
$V(x) = \prod_i x_i^{\hat{x}_i}$, but the informatic origin is not
apparent in this form, although the quantity $V$ can be interpreted as
the probability of finding a categorical distribution on $x$ in the
state $\hat{x}$. The quantity $D_{KL}(\hat{x}\|x)$ can be described as
the potential information of the replicator system. The above result
can then be interpreted information theoretically: natural selection
acts to minimize the potential information.

Theorem 5 holds for a class of ecological dynamics. A dynamic of the
form $\dot{x}_i = x_i g_i(x)$, $i = 1, \ldots, n$ (an ecological
dynamic) is called aggregate monotone with respect to a fitness
landscape $f$ if $g = (g_1, \ldots, g_n)$ has the property that
$y \cdot f(x) > z \cdot f(x)$ if and only if
$y \cdot g(x) > z \cdot g(x)$, for all distributions $x, y, z$. An
aggregate monotone dynamic is the replicator dynamic up to a change in
velocity [11]. In particular, the replicator equation with a convex
function applied to the fitness landscape is aggregate monotone.
Consider the following extension of Theorem 5.

Theorem 6. For an aggregate monotone ecological dynamic
$\dot{x}_i = x_i g_i(x)$, $D_{KL}(\hat{x}\|x)$ is a Lyapunov function
for the dynamic if $\hat{x}$ is an interior ESS.

Proof. Let $V(x) = D_{KL}(\hat{x}\|x) = \sum_i \hat{x}_i \log \hat{x}_i - \sum_i \hat{x}_i \log x_i$.
Note that since $\dot{x}_i = x_i g_i(x)$ is a dynamic on the simplex,
$0 = \sum_i \dot{x}_i = \sum_i x_i g_i(x) = x \cdot g(x)$. Then we have
that

\[ \dot{V}(x) = -\sum_i \hat{x}_i \frac{\dot{x}_i}{x_i} = -\sum_i \hat{x}_i g_i(x) = -\hat{x} \cdot g(x) = -(\hat{x} \cdot g(x) - x \cdot g(x)). \]

Applying aggregate monotonicity to the last equality completes the
proof. □

Since a change of velocity does not alter the orbits of the dynamic,
Theorem 6 shows that the replicator equation is essentially the only
aggregate monotone ecological dynamic in which evolutionary stability
corresponds to minimizing the Kullback-Leibler divergence. Exactly
which class of evolutionary dynamics has this property is an open
question. From the proof it is clear that the assumption of aggregate
monotonicity is too strong for a full characterization since it is only
needed that $\hat{x} \cdot g(x) - x \cdot g(x) > 0$ if
$\hat{x} \cdot f(x) - x \cdot f(x) > 0$, which quantifies over two
distributions rather than three.

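3.5. Exponential Families as Solutions of the Replicator Equation. The
exponential map on the Shahshahani manifold is

\[ \exp(x, v) = \sum_i \frac{x_i e^{v_i}}{\sum_j x_j e^{v_j}}\, \delta_i, \]

where $\delta_i$ is the $i$-th coordinate vector [6]. The exponential
map reduces to the exponential family at the barycenter
$b = (\frac{1}{n}, \ldots, \frac{1}{n})$,

\[ \exp(b, v) = \sum_i \frac{e^{v_i}}{\sum_j e^{v_j}}\, \delta_i. \]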

The solutions of the replicator equation can be realized as exponential
families [2,6,9]. Let $x_i = \exp(v_i - G)$ with $\dot{v}_i = f_i(x)$
and $G(x)$ a normalization constant to ensure that the distribution
sums to one. From the fact that $\sum_i x_i = 1$, $0 = \sum_i \dot{x}_i$
and so

\[ \begin{aligned}
0 = \sum_i \dot{x}_i &= \sum_i \exp(v_i(x) - G(x)) (\dot{v}_i(x) - \dot{G}(x)) \\
&= \sum_i x_i (\dot{v}_i(x) - \dot{G}(x)) = \sum_i (x_i f_i(x)) - \dot{G}(x) \\
&= \bar{f}(x) - \dot{G}(x).
\end{aligned} \]

Hence $\dot{G} = \bar{f}(x)$. Now $x_i$ satisfies

\[ \dot{x}_i = \exp(v_i(x) - G(x)) (\dot{v}_i(x) - \dot{G}(x)) = x_i (f_i(x) - \bar{f}(x)), \]

which is the replicator equation. In the case of a log-linear fitness
landscape, explicit solutions can be derived [6]. In this case, the
equation for the variable $v$ can be reduced to a linear differential
equation, which can be solved with eigenvalue methods.
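The exponential-family form of the solution can be made concrete in the simplest setting. The sketch below assumes a constant fitness landscape $f_i(x) = c_i$, which is an illustrative simplification and not the log-linear case of [6]. Then $v_i(t) = \log x_i(0) + c_i t$ and $G(t) = \log \sum_j e^{v_j(t)}$, and the closed form is compared with a forward-Euler integration of the replicator equation.

```python
import math
from typing import List

Vector = List[float]


def closed_form(x0: Vector, c: Vector, t: float) -> Vector:
    """x_i(t) = exp(v_i(t) - G(t)) with v_i(t) = log x_i(0) + c_i * t."""
    v = [math.log(xi) + ci * t for xi, ci in zip(x0, c)]
    G = math.log(sum(math.exp(vi) for vi in v))  # normalization term
    return [math.exp(vi - G) for vi in v]


def euler_replicator(x0: Vector, c: Vector, t: float, steps: int = 10000) -> Vector:
    """Forward-Euler solution of dx_i/dt = x_i * (c_i - mean fitness)."""
    dt = t / steps
    x = list(x0)
    for _ in range(steps):
        fbar = sum(xi * ci for xi, ci in zip(x, c))
        x = [xi + dt * xi * (ci - fbar) for xi, ci in zip(x, c)]
    return x


# Hypothetical initial distribution and constant fitness values.
start = [0.5, 0.3, 0.2]
rates = [1.0, 2.0, 3.0]
exact = closed_form(start, rates, 1.0)
approx = euler_replicator(start, rates, 1.0)
print(exact)
print(approx)
```

The two results should agree up to the discretization error of the Euler scheme.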
3.6. Denormalization. Information geometry defines the denormalized
manifold $\tilde{P}(X) = \{\tau p \mid \tau \in \mathbb{R}_+,\ p \in P(X)\}$,
which can be thought of as non-normalized discrete probability
distributions. As with $P(X)$, $\tilde{P}(X)$ has an information
metric. The denormalized manifold embeds into the reals as
$\mathbb{R}^n_+$, with the denormalized information metric induced by
the metric given by Shahshahani, where the mapping back onto $P(X)$
realizes $\tau$ as $|x|$. In coordinates, the metric is given by

\[ \tilde{g}_{ij}(x) = \tau g_{ij}(x) = \frac{\tau}{x_i}\delta_{ij}. \]

Akin uses the metric

\[ g_{ij}(x) = \frac{1}{x_i}\delta_{ij} \]

on $\mathbb{R}^n_+$ rather than the metric

\[ g_{ij}(x) = \frac{|x|}{x_i}\delta_{ij} \]

given by Shahshahani [2,13]. Both metrics restrict to the same metric
given by Shahshahani on the simplex. From the point of view of
information geometry, the metric given by Shahshahani is the natural
choice. The choice affects the form of the gradient on
$\mathbb{R}^n_+$, which in the case of Akin's metric is the
Lotka-Volterra predator-prey equation.

3.7. The Lotka-Volterra Equations and the Replicator Equation. The
Lotka-Volterra equations

\[ \dot{x}_i = x_i f_i(x) \tag{1} \]

descend from $\mathbb{R}^n_+$, through a normalization map onto the
simplex, to a replicator equation with an altered landscape. To see
this, let $|x| = x_1 + \cdots + x_n$, $\dot{x}_i = x_i f_i(x)$ and
$y_i = \frac{x_i}{|x|}$. Rearrange to $|x| y_i = x_i$ and note that
$\frac{d}{dt}|x| = \sum_i \dot{x}_i = \sum_i x_i f_i(x) = x \cdot f(x)$.
By the product rule, $\left(\frac{d}{dt}|x|\right) y_i + |x| \dot{y}_i = \dot{x}_i$
and so

\[ \begin{aligned}
\dot{y}_i &= \frac{\dot{x}_i - \left(\frac{d}{dt}|x|\right) y_i}{|x|} \\
&= \frac{x_i f_i(x) - (x \cdot f(x))\, y_i}{|x|} \\
&= y_i (f_i(x) - y \cdot f(x)) \\
&= y_i (g_i(y) - y \cdot g(y)),
\end{aligned} \]

where $g_i(y) = f_i(x)$ is an alteration of the fitness landscape.
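The reduction from the Lotka-Volterra equations to a replicator equation on the normalized variables can be checked numerically. The sketch below compares a finite-difference velocity of $y = x/|x|$ along one Lotka-Volterra Euler step with the formula $y_i(f_i(x) - y \cdot f(x))$. The linear landscape given by `GROWTH` and `INTERACTION`, the sample state, and the step size are hypothetical choices for illustration.

```python
from typing import List

Vector = List[float]

# Hypothetical Lotka-Volterra parameters: f_i(x) = r_i + sum_j a_ij * x_j.
GROWTH = [1.0, -0.5, 0.2]
INTERACTION = [[-0.1, 0.2, 0.0],
               [0.3, -0.2, 0.1],
               [0.0, -0.1, -0.3]]


def lv_fitness(x: Vector) -> Vector:
    """Per-capita growth rates f(x) for the assumed linear landscape."""
    return [r + sum(a * xj for a, xj in zip(row, x))
            for r, row in zip(GROWTH, INTERACTION)]


def normalize(x: Vector) -> Vector:
    """Normalization map x -> x / |x| onto the simplex."""
    total = sum(x)
    return [xi / total for xi in x]


def normalized_velocity_formula(x: Vector) -> Vector:
    """Replicator form: dy_i/dt = y_i * (f_i(x) - y . f(x))."""
    f = lv_fitness(x)
    y = normalize(x)
    mean = sum(yi * fi for yi, fi in zip(y, f))
    return [yi * (fi - mean) for yi, fi in zip(y, f)]


def normalized_velocity_numeric(x: Vector, dt: float = 1e-7) -> Vector:
    """Finite-difference dy/dt along one Euler step of dx_i/dt = x_i * f_i(x)."""
    f = lv_fitness(x)
    x_next = [xi + dt * xi * fi for xi, fi in zip(x, f)]
    y0, y1 = normalize(x), normalize(x_next)
    return [(b - a) / dt for a, b in zip(y0, y1)]


state = [2.0, 1.0, 3.0]
print(normalized_velocity_formula(state))
print(normalized_velocity_numeric(state))
```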
The Lotka-Volterra equations are the gradient flow with respect to
the metric given by Akin on $\mathbb{R}^n_+$ [8]. The gradient with
respect to the metric given by Shahshahani differs by a factor of $|x|$:

\[ \dot{x}_i = \frac{x_i}{|x|} f_i(x). \tag{2} \]

This system is transformable to Equation (1) after a change of velocity
eliminating the scalar function $B(x) = \frac{1}{|x|}$, because $B$ is
strictly positive on $\mathbb{R}^n_+$. Equation (2) transforms to a
replicator equation via the normalization map [13].

The Lotka-Volterra equations can be interpreted as a gradient of
the denormalized Fisher information metric, in the case that $f$ is a
Euclidean gradient, in analogy to the replicator equation. This allows
denormalized analogues of earlier results, such as the following, which
is true for the denormalized version of the Lotka-Volterra equation.

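Theorem 7. Let $\hat{x}$ in $\mathbb{R}^n_+$ be such that

\[ \frac{\hat{x} \cdot f(x)}{|\hat{x}|} > \frac{x \cdot f(x)}{|x|} \]

in some neighborhood of $\hat{x}$ (a denormalized ESS). Suppose that the
trajectory of $\dot{x}_i = \frac{x_i}{|x|} f_i(x)$ lies in a set that
contains no point parallel to $\hat{x}$. Then the denormalized
Kullback-Leibler divergence

\[ D_{KL}\left( \frac{\hat{x}}{|\hat{x}|} \,\Big\|\, \frac{x}{|x|} \right) \]

is a local Lyapunov function for Equation (2).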

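Proof. The divergence is minimal (and equal to zero) when
$x = c\hat{x}$ for some constant $c$. Hence if the line through the
origin and the point $\hat{x}$ intersects the trajectory at most once,
the divergence is zero if and only if $x = \hat{x}$. The time
derivative is

\[ \begin{aligned}
\frac{d}{dt} D_{KL}\left( \frac{\hat{x}}{|\hat{x}|} \,\Big\|\, \frac{x}{|x|} \right)
&= -\frac{d}{dt} \sum_i \frac{\hat{x}_i}{|\hat{x}|} \left( \log x_i - \log |x| \right) \\
&= -\sum_i \frac{\hat{x}_i}{|\hat{x}|} \frac{1}{|x|} f_i(x) + \frac{x \cdot f(x)}{|x|^2} \\
&= -\frac{1}{|x|} \left( \frac{\hat{x} \cdot f(x)}{|\hat{x}|} - \frac{x \cdot f(x)}{|x|} \right) < 0.
\end{aligned} \]

□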



4. Informatics of Multiple Population Replicator Dynamics

The information-theoretic approach easily extends to multiple pop- 
ulation replicator equations such as bimatrix games. As before, the 
potential information plays a crucial role. It is the sum of the poten- 
tial informations of all populations that plays the role of the Lyapunov 
function and gives rise to the geometry. It suffices to discuss the two 
population case as it is clear that the results extend inductively to 
finitely-many populations. 



4.1. Two Populations. Consider two categorical distributions
$p = (p_1, \ldots, p_n)$ and $q = (q_{n+1}, \ldots, q_{n+m})$ with
fitness landscapes $f(p, q) = (f_1(p, q), \ldots, f_n(p, q))$ and
$g(p, q) = (g_{n+1}(p, q), \ldots, g_{n+m}(p, q))$. Define the coupled
replicator system

\[ \begin{aligned}
\dot{p}_i &= p_i \left( f_i(p, q) - E_p[f(p, q)] \right) \\
\dot{q}_j &= q_j \left( g_j(p, q) - E_q[g(p, q)] \right)
\end{aligned} \]

where $i$ runs from $1$ to $n$ and $j$ runs from $n + 1$ to $n + m$.
Note carefully that the expected values are taken with each
distribution respectively.
This system is the gradient flow of the Riemannian metric defined
on the interior of $\Delta^n \times \Delta^m$ given by

\[ g_{ij} = \begin{cases} \frac{1}{p_i} & \text{if } i = j \leq n \\ \frac{1}{q_i} & \text{if } i = j > n \\ 0 & \text{else.} \end{cases} \]

That is, the matrix is the direct sum of the usual metric for each
equation. As in the single population case, we can use potential
information to form a Lyapunov function for the system. Given states
$\hat{p}$ and $\hat{q}$, let $L$ be the sum of the potential information
of each categorical distribution. That is,

\[ \begin{aligned}
L &= D_{KL}(\hat{p}\|p) + D_{KL}(\hat{q}\|q) \\
&= \sum_i \hat{p}_i \log \hat{p}_i - \sum_i \hat{p}_i \log p_i + \sum_j \hat{q}_j \log \hat{q}_j - \sum_j \hat{q}_j \log q_j.
\end{aligned} \]

The metric can be obtained as the localization of the sum of the
divergence functions. All the usual calculations follow from the fact
that the system is a gradient, e.g. Fisher's Fundamental Theorem.


4.2. Potential Information is a Lyapunov Function. Recall that
$\hat{p}$ is an ESS in the single population case if
$\hat{p} \cdot f(p) > p \cdot f(p)$ for all $p$ in a neighborhood of
$\hat{p}$.

Theorem 8. If $\hat{p}$ and $\hat{q}$ are ESS for each system
respectively then $L$ is a Lyapunov function for the coupled system.

Proof. A straightforward computation shows that

\[ -\dot{L} = \hat{p} \cdot f(p, q) - p \cdot f(p, q) + \hat{q} \cdot g(p, q) - q \cdot g(p, q). \]

$L$ is nonnegative and has its minimum at $(\hat{p}, \hat{q})$. Since
$\hat{p}$ and $\hat{q}$ are ESS, $\dot{L} < 0$, so $L$ is a local
Lyapunov function. □

Notice that the hypothesis that both $\hat{p}$ and $\hat{q}$ are ESS is
too strong. Indeed, all that is required is that

\[ \hat{p} \cdot f(p, q) + \hat{q} \cdot g(p, q) > p \cdot f(p, q) + q \cdot g(p, q). \]

Call this condition a coupled ESS (as well as its obvious higher
dimensional analogs) and note that any ESS is a coupled ESS. Then $L$
is a Lyapunov function for the system if and only if
$(\hat{p}, \hat{q})$ is a coupled ESS for the two population system.
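The coupled ESS condition can be explored with a small simulation. The landscapes below are hypothetical and were constructed for this sketch so that the cross terms cancel: with $f(p, q) = -(p - \hat{p}) + K(q - \hat{q})$ and $g(p, q) = -(q - \hat{q}) - K^T(p - \hat{p})$, the coupled condition holds for all $(p, q) \neq (\hat{p}, \hat{q})$ even though each population's own condition need not hold for every state of the other. The code tracks $L$ along a forward-Euler trajectory of the coupled replicator system.

```python
import math
from typing import List, Tuple

Vector = List[float]

# Hypothetical target states and coupling matrix (not from the paper).
P_HAT = [0.3, 0.7]
Q_HAT = [0.6, 0.4]
K = [[1.0, 0.0], [0.0, 0.0]]


def kl(a_hat: Vector, a: Vector) -> float:
    """D(a_hat || a) = sum_i a_hat_i * log(a_hat_i / a_i)."""
    return sum(h * math.log(h / v) for h, v in zip(a_hat, a))


def landscapes(p: Vector, q: Vector) -> Tuple[Vector, Vector]:
    """f = -(p - p_hat) + K (q - q_hat),  g = -(q - q_hat) - K^T (p - p_hat)."""
    dp = [pi - hi for pi, hi in zip(p, P_HAT)]
    dq = [qi - hi for qi, hi in zip(q, Q_HAT)]
    f = [-dp[i] + sum(K[i][j] * dq[j] for j in range(len(q))) for i in range(len(p))]
    g = [-dq[j] - sum(K[i][j] * dp[i] for i in range(len(p))) for j in range(len(q))]
    return f, g


def coupled_step(p: Vector, q: Vector, dt: float) -> Tuple[Vector, Vector]:
    """One forward-Euler step of the coupled replicator system."""
    f, g = landscapes(p, q)
    fbar = sum(pi * fi for pi, fi in zip(p, f))  # mean taken with p
    gbar = sum(qi * gi for qi, gi in zip(q, g))  # mean taken with q
    p_next = [pi + dt * pi * (fi - fbar) for pi, fi in zip(p, f)]
    q_next = [qi + dt * qi * (gi - gbar) for qi, gi in zip(q, g)]
    return p_next, q_next


def potential_information(p: Vector, q: Vector) -> float:
    """L = D(p_hat || p) + D(q_hat || q)."""
    return kl(P_HAT, p) + kl(Q_HAT, q)


def run(p0: Vector, q0: Vector, dt: float = 0.01, steps: int = 1000) -> List[float]:
    """Values of L along the Euler trajectory started at (p0, q0)."""
    p, q = list(p0), list(q0)
    values = [potential_information(p, q)]
    for _ in range(steps):
        p, q = coupled_step(p, q, dt)
        values.append(potential_information(p, q))
    return values


L_values = run([0.8, 0.2], [0.1, 0.9])
print(L_values[0], L_values[-1])
```

For these assumed landscapes the continuous-time derivative is $\dot{L} = -(|p - \hat{p}|^2 + |q - \hat{q}|^2)$, so the recorded values should decrease along the trajectory.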
4.3. Solutions. We can again show that the solutions are exponential
families. Let $\dot{v} = f(p, q)$ and $\dot{w} = g(p, q)$. Let
$\dot{N} = E_p[f(p, q)]$ and $\dot{M} = E_q[g(p, q)]$. Then
$p_i = \exp(v_i - N)$ and $q_j = \exp(w_j - M)$ is a solution to the
coupled system. Indeed,

\[ \dot{p}_i = \exp(v_i - N)(\dot{v}_i - \dot{N}) = p_i \left( f_i(p, q) - E_p[f(p, q)] \right), \]

and similarly for $q_j$.

4.4. Multiple Populations. The above generalizes by induction to
show that for a coupled system of multiple interacting populations, the 
sum of the respective potential informations gives a Lyapunov function 
for a coupled ESS. 

5. Discussion 

The Shahshahani geometry can be interpreted within the framework 
of information theory as the information geometry of the simplex. This 
explains the origin of several quantities in evolutionary game theory 
including the Shahshahani metric and the Kullback-Leibler information
divergence. An important feature of the approach is that the
information-geometric reasoning extends to the Lotka-Volterra equa- 
tion and the multiple population replicator equation easily within the 
framework. Additionally, the replicator dynamic arises intuitively from 
purely mathematical and statistical concepts such as Fisher informa- 
tion. This shows that the replicator equation models the information 
dynamics of natural selection. 